
fix: replace deprecated kube-rbac-proxy with controller-runtime authn/authz #443

Merged
diranged merged 3 commits into diranged:main from schahal:remove-kube-rbac-proxy
Mar 11, 2026

Conversation

@schahal
Contributor

@schahal schahal commented Mar 10, 2026

Background

The gcr.io/kubebuilder/kube-rbac-proxy image is deprecated and Google Container Registry (GCR) is being sunset, making the image unavailable. This project used kube-rbac-proxy as a sidecar container to secure the /metrics endpoint via Kubernetes TokenReview and SubjectAccessReview. See kubernetes-sigs/cluster-api-addon-provider-helm#318 for upstream context.

Changes

Instead of finding an alternative image, this replaces the sidecar pattern entirely with controller-runtime's built-in SecureServing and WithAuthenticationAndAuthorization filter. The manager now serves metrics securely over HTTPS on port 8443 with native authn/authz — no sidecar needed. The kube-rbac-proxy container, its image configuration, and related Helm values have been removed from both the kustomize config and the Helm chart. The existing RBAC resources (TokenReview/SubjectAccessReview permissions and metrics-reader ClusterRole) are retained since the manager now performs these checks itself.
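The request flow described above (authenticate the bearer token, then authorize the caller for /metrics) can be sketched with the standard library alone. This is a minimal model, not controller-runtime's implementation: the `reviewer` type, the token strings, and the `metrics-reader-demo` account are hypothetical stand-ins for the TokenReview and SubjectAccessReview calls, and the 401/403 status codes are an assumption, since the PR only shows the response bodies.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// reviewer is a hypothetical in-memory stand-in for the two API checks:
// TokenReview (who owns this token?) and SubjectAccessReview (may they read /metrics?).
type reviewer struct {
	users   map[string]string // bearer token -> username
	allowed map[string]bool   // username -> may read /metrics
}

// withAuthnAuthz runs next only for callers that pass both checks.
func withAuthnAuthz(r reviewer, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		header := req.Header.Get("Authorization")
		user, known := r.users[strings.TrimPrefix(header, "Bearer ")]
		if !strings.HasPrefix(header, "Bearer ") || !known {
			http.Error(w, "Unauthorized", http.StatusUnauthorized)
			return
		}
		if !r.allowed[user] {
			http.Error(w, "Authorization denied for user "+user, http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, req)
	})
}

// newDemoHandler wires a fake metrics handler behind the filter.
func newDemoHandler() http.Handler {
	r := reviewer{
		users: map[string]string{
			"plain-token":  "system:serviceaccount:oz-system:oz-controller-manager",
			"reader-token": "system:serviceaccount:oz-system:metrics-reader-demo",
		},
		allowed: map[string]bool{
			"system:serviceaccount:oz-system:metrics-reader-demo": true,
		},
	}
	metrics := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "certwatcher_read_certificate_errors_total 0")
	})
	return withAuthnAuthz(r, metrics)
}

// statusFor issues an in-memory GET /metrics; an empty token sends no header.
func statusFor(h http.Handler, token string) int {
	req := httptest.NewRequest(http.MethodGet, "/metrics", nil)
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := newDemoHandler()
	for _, token := range []string{"", "plain-token", "reader-token"} {
		fmt.Printf("token=%q status=%d\n", token, statusFor(h, token))
	}
}
```

Because the two checks run in sequence, a caller with a valid token but no role binding gets a different answer from a caller with no token at all. The Testing section below observes the same distinction.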

Testing

Deployed to a local KIND cluster and verified the pod runs with a single container (no sidecar). Confirmed that unauthenticated requests to the metrics endpoint return Unauthorized. Verified that authenticated requests without the metrics-reader ClusterRole return Authorization denied. Confirmed that authenticated requests with the metrics-reader role bound successfully return Prometheus metrics.

🤖 Generated with Claude Code

…/authz

The gcr.io/kubebuilder/kube-rbac-proxy image is deprecated and GCR is
being sunset. Replace the kube-rbac-proxy sidecar with controller-runtime's
built-in SecureServing and WithAuthenticationAndAuthorization filter, which
provides the same TokenReview/SubjectAccessReview security model natively.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added documentation Improvements or additions to documentation go Pull requests that update Go code repo build labels Mar 10, 2026
@schahal
Contributor Author

schahal commented Mar 10, 2026

Testing on my KIND cluster, after port-forwarding the metrics endpoint:

Expected (unauthenticated):

➜  ~ curl -k https://localhost:8443/metrics
Unauthorized

Expected (authenticated but no RBAC):

➜  ~ (⎈|kind-default:oz-system) TOKEN=$(kubectl create token oz-controller-manager -n oz-system)
➜  ~ (⎈|kind-default:oz-system) curl -k -H "Authorization: Bearer $TOKEN" https://localhost:8443/metrics
Authorization denied for user system:serviceaccount:oz-system:oz-controller-manager

Expected (after creating clusterrolebinding for "metrics-reader"):

➜  ~ (⎈|kind-default:oz-system) TOKEN=$(kubectl create token oz-controller-manager -n oz-system)
➜  ~ (⎈|kind-default:oz-system) curl -k -H "Authorization: Bearer $TOKEN" https://localhost:8443/metrics
# HELP certwatcher_read_certificate_errors_total Total number of certificate read errors
# TYPE certwatcher_read_certificate_errors_total counter
certwatcher_read_certificate_errors_total 0
...
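The same checks could be scripted in Go rather than run by hand with curl. The sketch below is an illustration under assumptions: `probe` and `fakeMetrics` are hypothetical names, `good-token` is a made-up token, and the `httptest` TLS server stands in for the port-forwarded manager so the example runs without a cluster. Like `curl -k`, it skips certificate verification, which is only reasonable for local testing.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// probe GETs url with an optional bearer token and returns the status code
// and the trimmed response body.
func probe(url, token string) (int, string, error) {
	client := &http.Client{Transport: &http.Transport{
		// Equivalent to curl -k: accept the self-signed serving certificate.
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
	}}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, "", err
	}
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, strings.TrimSpace(string(body)), nil
}

// fakeMetrics is a hypothetical stand-in for the manager's HTTPS metrics endpoint.
func fakeMetrics() *httptest.Server {
	return httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer good-token" {
			http.Error(w, "Unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprintln(w, "certwatcher_read_certificate_errors_total 0")
	}))
}

func main() {
	srv := fakeMetrics()
	defer srv.Close()
	for _, token := range []string{"", "good-token"} {
		code, body, err := probe(srv.URL+"/metrics", token)
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		fmt.Println(code, body)
	}
}
```

Against a real cluster, the URL would be the port-forwarded address used above and the token would come from `kubectl create token`.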

@schahal schahal marked this pull request as ready for review March 10, 2026 18:11
@schahal schahal requested a review from diranged as a code owner March 10, 2026 18:11
@coveralls

coveralls commented Mar 10, 2026

Pull Request Test Coverage Report for Build 22933760501

Details

  • 0 of 10 (0.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.1%) to 35.907%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| internal/cmd/manager/main.go | 0 | 10 | 0.0% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 22933500604: | -0.1% |
| Covered Lines: | 1037 |
| Relevant Lines: | 2888 |
💛 - Coveralls
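As a quick check of the arithmetic in the report, the overall figure is covered lines divided by relevant lines. The helper name `coveragePercent` below is hypothetical, and the sketch only reproduces the two percentages quoted in the report.

```go
package main

import "fmt"

// coveragePercent returns covered/relevant as a percentage.
// It returns 0 when there are no relevant lines, to avoid dividing by zero.
func coveragePercent(covered, relevant int) float64 {
	if relevant == 0 {
		return 0
	}
	return 100 * float64(covered) / float64(relevant)
}

func main() {
	// Overall: 1037 covered of 2888 relevant lines.
	fmt.Printf("overall: %.3f%%\n", coveragePercent(1037, 2888)) // overall: 35.907%
	// Patch: 0 of the 10 changed or added lines are covered.
	fmt.Printf("patch:   %.1f%%\n", coveragePercent(0, 10)) // patch:   0.0%
}
```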

@schahal
Contributor Author

schahal commented Mar 10, 2026

@diranged the CI job is probably failing because my source branch is on a fork, which likely can't upload dependency graphs (I don't have write/push permissions on this repo, so I had to fork).

But I'm guessing those are non-required checks?

The building and local testing all seemed to pass.

Owner

@diranged diranged left a comment


Solid fix. The premise is right: kube-rbac-proxy is deprecated and GCR is sunsetting, so this needs to happen. Using controller-runtime's built-in SecureServing + WithAuthenticationAndAuthorization is the officially recommended migration path, so this is the right approach.

Two things to be aware of:

  1. The dependency-review CI failure is from go.opentelemetry.io/otel/sdk@1.36.0 (GHSA-9h8m-3fm2-qjrq) - it's a transitive dep pulled in by k8s.io/apiserver, so not really actionable here. Just worth tracking until upstream bumps it.

  2. The metrics port name is now hardcoded to https instead of being dynamic from values. This shouldn't matter since the kube-rbac-proxy values are gone, but anyone whose ServiceMonitor config references the old port name by value would need to update it.

E2E tests pass across six K8s versions, helm-test passes, and lint is clean. Ship it.

diranged added a commit that referenced this pull request Mar 11, 2026
…rkflow (#444)

## Summary
- The `godeps.yaml` workflow triggers on `pull_request` events, but the
`actions/go-dependency-submission` action requires `contents: write`
permission to submit dependency snapshots
- On `pull_request` events, the `GITHUB_TOKEN` is read-only, causing
"Resource not accessible by integration" errors (e.g. [PR
#443](#443))
- Dependency snapshots only need to be submitted when code lands on
`main`, so the `pull_request` trigger is removed

## Test plan
- [ ] Verify CI passes on this PR (the `go-action-detection` job should
no longer run)
- [ ] Verify the workflow still runs on push to `main` after merge

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@diranged
Owner

@schahal The other failure is actually a security failure on a dependency that was introduced (otel 1.36). I'm upgrading it to 1.40, which should clear it.

Resolves CVE-2026-24051 (GHSA-9h8m-3fm2-qjrq): OpenTelemetry Go SDK
v1.21.0-v1.39.0 is vulnerable to arbitrary code execution via PATH
hijacking on macOS/Darwin systems.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
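The version range in the commit message can be checked mechanically. The sketch below is a simplified illustration: `parseVersion`, `compare`, and `affected` are hypothetical helpers, the bounds are copied from the message and treated as inclusive, and only plain MAJOR.MINOR.PATCH strings are handled (no pre-release or build suffixes). A real project would more likely rely on a semver library or a vulnerability scanner.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits "v1.36.0" or "1.36.0" into numeric major, minor, patch.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("want MAJOR.MINOR.PATCH, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("bad component %q in %q", p, v)
		}
		out[i] = n
	}
	return out, nil
}

// compare returns -1, 0, or 1 when a is lower than, equal to, or higher than b.
func compare(a, b [3]int) int {
	for i := range a {
		if a[i] < b[i] {
			return -1
		}
		if a[i] > b[i] {
			return 1
		}
	}
	return 0
}

// affected reports whether v lies in the inclusive range v1.21.0 to v1.39.0
// quoted in the commit message.
func affected(v string) (bool, error) {
	ver, err := parseVersion(v)
	if err != nil {
		return false, err
	}
	low := [3]int{1, 21, 0}
	high := [3]int{1, 39, 0}
	return compare(ver, low) >= 0 && compare(ver, high) <= 0, nil
}

func main() {
	for _, v := range []string{"1.36.0", "1.40.0"} {
		hit, err := affected(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s affected: %t\n", v, hit)
	}
}
```

Under these assumptions, 1.36.0 falls inside the range and 1.40.0 falls outside it, which matches the upgrade described above.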
@diranged diranged merged commit af8f5fe into diranged:main Mar 11, 2026
31 of 32 checks passed
Labels

build documentation Improvements or additions to documentation go Pull requests that update Go code repo

3 participants